On Anonymization of String Data

Authors

  • Charu C. Aggarwal
  • Philip S. Yu
Abstract

String data is especially important in the privacy-preserving data mining domain because most DNA and biological data is coded as strings. In this paper, we discuss a new method for privacy-preserving mining of string data that uses simple template-based condensation models. The template-based model turns out to be effective in practice, and preserves important statistical characteristics of the...


Similar resources

An Effective Method for Utility Preserving Social Network Graph Anonymization Based on Mathematical Modeling

In recent years, privacy concerns about the publishing of social network graph data have increased due to the widespread use of such data for research purposes. This paper addresses the risk of identity disclosure for a node, assuming that the adversary identifies one of its immediate neighbors in the published data. The related anonymity level of a graph is formulated and a mathematical model is...

Location Data Privacy Preservation in Mobile Networks

The uncontrolled variety of location-based service (LBS) applications in mobile networks has led to a loss of privacy for individual users' location data. To preserve privacy in mobile networks, data randomization and anonymization must be used because of location data uncertainty. In data randomization, additional random data is added to the original location data, desensitizing the precise information cont...

Anagrams for String Database Anonymity

In this paper we focus on the privacy-preserving publication of string data. String data are common to a variety of application fields, ranging from biological data (with DNA and RNA sequences) to web logs, where users' sessions are modeled as sequences of user actions. The existence of structural information in each record, i.e. the sequence of terms, renders the problem of anonymizing collection...

Impact of Pollution Location on Time and Frequency Characteristics of Leakage Current of Porcelain Insulator String under Different Humidity and Contamination Severity

One of the important factors influencing the performance of outdoor insulators is pollution. Pollution, especially in humid conditions, reduces the surface resistance of the insulator and leads to a flow of leakage current (LC) on the insulator surface, which may result in total flashover. The LC characteristics are affected by parameters such as the nature and severity of the pollution. Loca...

A Survey on Data Anonymization Techniques for Large Data Sets

Data anonymization is used to remove user-specific information from published data sets. Different kinds of anonymization techniques are used to counter various types of attacks. The anonymization process modifies the data into a form in which individuals cannot be identified, and it is more efficient than other privacy-preserving techniques such as encryption. Encryption is costly when compared to anonymization as...

Publication date: 2007